SemanticScuttle - klotz.me » Tags: fine tuning

Tags: fine tuning*

0 bookmark(s) - Sort by: Date ↓ / Title /

The article introduces a new approach to language modeling called test-time scaling, which enhances performance by utilizing additional compute resources during testing. The authors present a method involving a curated dataset and a technique called budget forcing to control compute usage, allowing models to double-check answers and improve reasoning. The approach is demonstrated with the Qwen2.5-32B-Instruct language model, showing significant improvements on competition math questions.

2025-02-14 Tags: arxiv, test-time scaling, budget forcing, llm, qwen2.5-32b-instruct, sft, fine tuning, reinforcement learning, machine learning, deepseek-r1 by klotz

How to Fine-tune Florence-2 for Object Detection Tasks

This article provides a step-by-step guide on fine-tuning the Florence-2 model for object detection tasks, including loading the pre-trained model, fine-tuning with a custom dataset, and evaluating the model's performance.

2024-06-26 Tags: florence-2, object detection, multimodal, llm, vision, microsoft, fine tuning by klotz

mistral-finetune - GitHub

A light-weight codebase that enables memory-efficient and performant finetuning of Mistral's models. It is based on LoRA, a training paradigm where most weights are frozen and only 1-2% additional weights in the form of low-rank matrix perturbations are trained.

2024-06-06 Tags: github, mistral, lora, python, machine learning, fine tuning, llm by klotz

A Step-by-Step Guide to Representation Finetuning LLAMA3

"The paper introduces a technique called LoReFT (Low-rank Linear Subspace ReFT). Similar to LoRA (Low Rank Adaptation), it uses low-rank approximations to intervene on hidden representations. It shows that linear subspaces contain rich semantics that can be manipulated to steer model behaviors."

2024-05-26 Tags: linear subspace, lora, representation, fine tuning, reft, stanford, nlp, python, llm by klotz

Tuning Language Models by Proxy

Introduces proxy-tuning, a lightweight decoding-time algorithm that operates on top of black-box LMs to achieve the same end as direct tuning. The method tunes a smaller LM, then applies the difference between the predictions of the small tuned and untuned LMs to shift the original predictions of the larger untuned model in the direction of tuning, while retaining the benefits of larger-scale pretraining.

2024-05-11 Tags: proxy, fine tuning, llm, llama2-70b by klotz

mistral-doc: Fine-tuning an LLM on my ChatGPT conversations - Duarte O.Carmo

2024-04-14 Tags: llm, mistral, fine tuning by klotz

Lightning-AI's LitGPT Tutorial

GitHub repository for a tutorial series called "0 to LitGPT."
Provides an overview of how to get started with LitGPT, which is an open-source implementation of GPT-3.
Offers various resources such as codes, issues, pull requests, actions, security features, insights, and more related to the LitGPT project.

2024-03-28 Tags: llm, fine tuning, github, litegpt, tutorial by klotz

understanding-using-and-finetuning-gemma

2024-02-24 Tags: llm, gemma, fine tuning, google by klotz

Mistral Fine Tuning with QLora

2024-02-22 Tags: mistral, fine tuning, llm, qlora by klotz

Fine Tuning LLM on RTX 3090

Discusses the use of consumer graphics cards for fine-tuning large language models (LLMs)
Compares consumer graphics cards, such as NVIDIA GeForce RTX Series GPUs, to data center and cloud computing GPUs
Highlights the differences in GPU memory and price between consumer and data center GPUs
Shares the author's experience using a GeForce 3090 RTX card with 24GB of GPU memory for fine-tuning LLMs

2024-02-02 Tags: llm, fine tuning, nvidia, rtx, 3090, self-hosted by klotz

First / Previous / Next / Last / Page 1 of 0

SemanticScuttle - klotz.me

Tags: fine tuning*

Linked Tags

Related Tags